Search | VHL Regional Portal

1.

Hagfish genome elucidates vertebrate whole-genome duplication events and their evolutionary consequences.

Yu, Daqi; Ren, Yandong; Uesaka, Masahiro; Beavan, Alan J S; Muffato, Matthieu; Shen, Jieyu; Li, Yongxin; Sato, Iori; Wan, Wenting; Clark, James W; Keating, Joseph N; Carlisle, Emily M; Dearden, Richard P; Giles, Sam; Randle, Emma; Sansom, Robert S; Feuda, Roberto; Fleming, James F; Sugahara, Fumiaki; Cummins, Carla; Patricio, Mateus; Akanni, Wasiu; D'Aniello, Salvatore; Bertolucci, Cristiano; Irie, Naoki; Alev, Cantas; Sheng, Guojun; de Mendoza, Alex; Maeso, Ignacio; Irimia, Manuel; Fromm, Bastian; Peterson, Kevin J; Das, Sabyasachi; Hirano, Masayuki; Rast, Jonathan P; Cooper, Max D; Paps, Jordi; Pisani, Davide; Kuratani, Shigeru; Martin, Fergal J; Wang, Wen; Donoghue, Philip C J; Zhang, Yong E; Pascual-Anaya, Juan.

Nat Ecol Evol ; 8(3): 519-535, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38216617

ABSTRACT

Polyploidy or whole-genome duplication (WGD) is a major event that drastically reshapes genome architecture and is often assumed to be causally associated with organismal innovations and radiations. The 2R hypothesis suggests that two WGD events (1R and 2R) occurred during early vertebrate evolution. However, the timing of the 2R event relative to the divergence of gnathostomes (jawed vertebrates) and cyclostomes (jawless hagfishes and lampreys) is unresolved and whether these WGD events underlie vertebrate phenotypic diversification remains elusive. Here we present the genome of the inshore hagfish, Eptatretus burgeri. Through comparative analysis with lamprey and gnathostome genomes, we reconstruct the early events in cyclostome genome evolution, leveraging insights into the ancestral vertebrate genome. Genome-wide synteny and phylogenetic analyses support a scenario in which 1R occurred in the vertebrate stem-lineage during the early Cambrian, and 2R occurred in the gnathostome stem-lineage, maximally in the late Cambrian-earliest Ordovician, after its divergence from cyclostomes. We find that the genome of stem-cyclostomes experienced an additional independent genome triplication. Functional genomic and morphospace analyses demonstrate that WGD events generally contribute to developmental evolution with similar changes in the regulatory genome of both vertebrate groups. However, appreciable morphological diversification occurred only in the gnathostome but not in the cyclostome lineage, calling into question the general expectation that WGDs lead to leaps of bodyplan complexity.

Subject(s)

Hagfishes , Animals , Phylogeny , Hagfishes/genetics , Gene Duplication , Vertebrates/genetics , Genome , Lampreys/genetics

2.

Drivers and determinants of strain dynamics following fecal microbiota transplantation.

Schmidt, Thomas S B; Li, Simone S; Maistrenko, Oleksandr M; Akanni, Wasiu; Coelho, Luis Pedro; Dolai, Sibasish; Fullam, Anthony; Glazek, Anna M; Hercog, Rajna; Herrema, Hilde; Jung, Ferris; Kandels, Stefanie; Orakov, Askarbek; Thielemann, Roman; von Stetten, Moritz; Van Rossum, Thea; Benes, Vladimir; Borody, Thomas J; de Vos, Willem M; Ponsioen, Cyriel Y; Nieuwdorp, Max; Bork, Peer.

Nat Med ; 28(9): 1902-1912, 2022 09.

Article in English | MEDLINE | ID: mdl-36109636

ABSTRACT

Fecal microbiota transplantation (FMT) is a therapeutic intervention for inflammatory diseases of the gastrointestinal tract, but its clinical mode of action and subsequent microbiome dynamics remain poorly understood. Here we analyzed metagenomes from 316 FMTs, sampled pre and post intervention, for the treatment of ten different disease indications. We quantified strain-level dynamics of 1,089 microbial species, complemented by 47,548 newly constructed metagenome-assembled genomes. Donor strain colonization and recipient strain resilience were mostly independent of clinical outcomes, but accurately predictable using LASSO-regularized regression models that accounted for host, microbiome and procedural variables. Recipient factors and donor-recipient complementarity, encompassing entire microbial communities to individual strains, were the main determinants of strain population dynamics, providing insights into the underlying processes that shape the post-FMT gut microbiome. Applying an ecology-based framework to our findings indicated parameters that may inform the development of more effective, targeted microbiome therapies in the future, and suggested how patient stratification can be used to enhance donor microbiota colonization or the displacement of recipient microbes in clinical practice.

Subject(s)

Clostridium Infections , Gastrointestinal Microbiome , Microbiota , Clostridium Infections/therapy , Fecal Microbiota Transplantation , Feces , Gastrointestinal Microbiome/genetics , Gastrointestinal Tract , Humans

3.

A faecal microbiota signature with high specificity for pancreatic cancer.

Kartal, Ece; Schmidt, Thomas S B; Molina-Montes, Esther; Rodríguez-Perales, Sandra; Wirbel, Jakob; Maistrenko, Oleksandr M; Akanni, Wasiu A; Alashkar Alhamwe, Bilal; Alves, Renato J; Carrato, Alfredo; Erasmus, Hans-Peter; Estudillo, Lidia; Finkelmeier, Fabian; Fullam, Anthony; Glazek, Anna M; Gómez-Rubio, Paulina; Hercog, Rajna; Jung, Ferris; Kandels, Stefanie; Kersting, Stephan; Langheinrich, Melanie; Márquez, Mirari; Molero, Xavier; Orakov, Askarbek; Van Rossum, Thea; Torres-Ruiz, Raul; Telzerow, Anja; Zych, Konrad; Benes, Vladimir; Zeller, Georg; Trebicka, Jonel; Real, Francisco X; Malats, Nuria; Bork, Peer.

Gut ; 71(7): 1359-1372, 2022 07.

Article in English | MEDLINE | ID: mdl-35260444

ABSTRACT

BACKGROUND: Recent evidence suggests a role for the microbiome in pancreatic ductal adenocarcinoma (PDAC) aetiology and progression. OBJECTIVE: To explore the faecal and salivary microbiota as potential diagnostic biomarkers. METHODS: We applied shotgun metagenomic and 16S rRNA amplicon sequencing to samples from a Spanish case-control study (n=136), including 57 cases, 50 controls, and 29 patients with chronic pancreatitis in the discovery phase, and from a German case-control study (n=76), in the validation phase. RESULTS: Faecal metagenomic classifiers performed much better than saliva-based classifiers and identified patients with PDAC with an accuracy of up to 0.84 area under the receiver operating characteristic curve (AUROC) based on a set of 27 microbial species, with consistent accuracy across early and late disease stages. Performance further improved to up to 0.94 AUROC when we combined our microbiome-based predictions with serum levels of carbohydrate antigen (CA) 19-9, the only current non-invasive, Food and Drug Administration approved, low specificity PDAC diagnostic biomarker. Furthermore, a microbiota-based classification model confined to PDAC-enriched species was highly disease-specific when validated against 25 publicly available metagenomic study populations for various health conditions (n=5792). Both microbiome-based models had a high prediction accuracy on a German validation population (n=76). Several faecal PDAC marker species were detectable in pancreatic tumour and non-tumour tissue using 16S rRNA sequencing and fluorescence in situ hybridisation. CONCLUSION: Taken together, our results indicate that non-invasive, robust and specific faecal microbiota-based screening for the early detection of PDAC is feasible.
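As a reading aid for the AUROC figures quoted in this abstract, the following is a minimal sketch of how an area under the receiver operating characteristic curve can be computed from classifier scores. It uses the standard pairwise-ranking definition (the fraction of case/control pairs in which the case scores higher, with ties counting one half). The function name `auroc` and the toy scores are illustrative assumptions and are not taken from the study.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` holds 1 for cases and 0 for controls (an assumed encoding).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # Count pairs where the case outranks the control; ties count one half.
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical scores for two cases and two controls.
example = auroc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of cases from controls.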

Subject(s)

Carcinoma, Pancreatic Ductal , Microbiota , Pancreatic Neoplasms , Biomarkers, Tumor , CA-19-9 Antigen , Carcinoma, Pancreatic Ductal/diagnosis , Carcinoma, Pancreatic Ductal/genetics , Case-Control Studies , Humans , Pancreatic Neoplasms/diagnosis , Pancreatic Neoplasms/genetics , RNA, Ribosomal, 16S/genetics

4.

Ensembl Genomes 2020-enabling non-vertebrate genomic research.

Howe, Kevin L; Contreras-Moreira, Bruno; De Silva, Nishadi; Maslen, Gareth; Akanni, Wasiu; Allen, James; Alvarez-Jarreta, Jorge; Barba, Matthieu; Bolser, Dan M; Cambell, Lahcen; Carbajo, Manuel; Chakiachvili, Marc; Christensen, Mikkel; Cummins, Carla; Cuzick, Alayne; Davis, Paul; Fexova, Silvie; Gall, Astrid; George, Nancy; Gil, Laurent; Gupta, Parul; Hammond-Kosack, Kim E; Haskell, Erin; Hunt, Sarah E; Jaiswal, Pankaj; Janacek, Sophie H; Kersey, Paul J; Langridge, Nick; Maheswari, Uma; Maurel, Thomas; McDowall, Mark D; Moore, Ben; Muffato, Matthieu; Naamati, Guy; Naithani, Sushma; Olson, Andrew; Papatheodorou, Irene; Patricio, Mateus; Paulini, Michael; Pedro, Helder; Perry, Emily; Preece, Justin; Rosello, Marc; Russell, Matthew; Sitnik, Vasily; Staines, Daniel M; Stein, Joshua; Tello-Ruiz, Marcela K; Trevanion, Stephen J; Urban, Martin.

Nucleic Acids Res ; 48(D1): D689-D695, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31598706

ABSTRACT

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of interfaces to genomic data across the tree of life, including reference genome sequence, gene models, transcriptional data, genetic variation and comparative analysis. Data may be accessed via our website, online tools platform and programmatic interfaces, with updates made four times per year (in synchrony with Ensembl). Here, we provide an overview of Ensembl Genomes, with a focus on recent developments. These include the continued growth, more robust and reproducible sets of orthologues and paralogues, and enriched views of gene expression and gene function in plants. Finally, we report on our continued deeper integration with the Ensembl project, which forms a key part of our future strategy for dealing with the increasing quantity of available genome-scale data across the tree of life.

Subject(s)

Computational Biology/methods , Databases, Genetic , Genetic Variation , Genome, Bacterial , Genome, Fungal , Genome, Plant , Algorithms , Animals , Caenorhabditis elegans/genetics , Genomics , Internet , Molecular Sequence Annotation , Phenotype , Plants/genetics , Reference Values , Software , User-Computer Interface

5.

Ensembl 2020.

Yates, Andrew D; Achuthan, Premanand; Akanni, Wasiu; Allen, James; Allen, Jamie; Alvarez-Jarreta, Jorge; Amode, M Ridwan; Armean, Irina M; Azov, Andrey G; Bennett, Ruth; Bhai, Jyothish; Billis, Konstantinos; Boddu, Sanjay; Marugán, José Carlos; Cummins, Carla; Davidson, Claire; Dodiya, Kamalkumar; Fatima, Reham; Gall, Astrid; Giron, Carlos Garcia; Gil, Laurent; Grego, Tiago; Haggerty, Leanne; Haskell, Erin; Hourlier, Thibaut; Izuogu, Osagie G; Janacek, Sophie H; Juettemann, Thomas; Kay, Mike; Lavidas, Ilias; Le, Tuan; Lemos, Diana; Martinez, Jose Gonzalez; Maurel, Thomas; McDowall, Mark; McMahon, Aoife; Mohanan, Shamika; Moore, Benjamin; Nuhn, Michael; Oheh, Denye N; Parker, Anne; Parton, Andrew; Patricio, Mateus; Sakthivel, Manoj Pandian; Abdul Salam, Ahamed Imran; Schmitt, Bianca M; Schuilenburg, Helen; Sheppard, Dan; Sycheva, Mira; Szuba, Marek.

Nucleic Acids Res ; 48(D1): D682-D688, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31691826

ABSTRACT

Ensembl (https://www.ensembl.org) is a system for generating and distributing genome annotation such as genes, variation, regulation and comparative genomics across the vertebrate subphylum and key model organisms. The Ensembl annotation pipeline is capable of integrating experimental and reference data from multiple providers into a single integrated resource. Here, we present 94 newly annotated and re-annotated genomes, bringing the total number of genomes offered by Ensembl to 227. This represents the single largest expansion of the resource since its inception. We also detail our continued efforts to improve human annotation, developments in our epigenome analysis and display, a new tool for imputing causal genes from genome-wide association studies and visualisation of variation within a 3D protein model. Finally, we present information on our new website. Both software and data are made available without restriction via our website, online tools platform and programmatic interfaces (available under an Apache 2.0 license), with data updates made available four times a year.

Subject(s)

Computational Biology/methods , Databases, Genetic , Epigenome , Molecular Sequence Annotation , Algorithms , Animals , Computer Graphics , Databases, Protein , Genetic Variation , Genome-Wide Association Study , Genomics , Histones/metabolism , Humans , Imaging, Three-Dimensional , Internet , Ligands , Search Engine , Software , Species Specificity , Transcriptome , User-Computer Interface , Web Browser

6.

Ensembl 2019.

Cunningham, Fiona; Achuthan, Premanand; Akanni, Wasiu; Allen, James; Amode, M Ridwan; Armean, Irina M; Bennett, Ruth; Bhai, Jyothish; Billis, Konstantinos; Boddu, Sanjay; Cummins, Carla; Davidson, Claire; Dodiya, Kamalkumar Jayantilal; Gall, Astrid; Girón, Carlos García; Gil, Laurent; Grego, Tiago; Haggerty, Leanne; Haskell, Erin; Hourlier, Thibaut; Izuogu, Osagie G; Janacek, Sophie H; Juettemann, Thomas; Kay, Mike; Laird, Matthew R; Lavidas, Ilias; Liu, Zhicheng; Loveland, Jane E; Marugán, José C; Maurel, Thomas; McMahon, Aoife C; Moore, Benjamin; Morales, Joannella; Mudge, Jonathan M; Nuhn, Michael; Ogeh, Denye; Parker, Anne; Parton, Andrew; Patricio, Mateus; Abdul Salam, Ahamed Imran; Schmitt, Bianca M; Schuilenburg, Helen; Sheppard, Dan; Sparrow, Helen; Stapleton, Eloise; Szuba, Marek; Taylor, Kieron; Threadgold, Glen; Thormann, Anja; Vullo, Alessandro.

Nucleic Acids Res ; 47(D1): D745-D751, 2019 01 08.

Article in English | MEDLINE | ID: mdl-30407521

ABSTRACT

The Ensembl project (https://www.ensembl.org) makes key genomic data sets available to the entire scientific community without restrictions. Ensembl seeks to be a fundamental resource driving scientific progress by creating, maintaining and updating reference genome annotation and comparative genomics resources. This year we describe our new and expanded gene, variant and comparative annotation capabilities, which led to a 50% increase in the number of vertebrate genomes we support. We have also doubled the number of available human variants and added regulatory regions for many mouse cell types and developmental stages. Our data sets and tools are available via the Ensembl website as well as through a RESTful web service, a Perl application programming interface and as data files for download.

Subject(s)

Databases, Genetic , Genome/genetics , Genomics , Vertebrates/genetics , Animals , Computational Biology/trends , Humans , Mice , Molecular Sequence Annotation , Software

7.

Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes.

Thybert, David; Roller, Masa; Navarro, Fábio C P; Fiddes, Ian; Streeter, Ian; Feig, Christine; Martin-Galvez, David; Kolmogorov, Mikhail; Janousek, Václav; Akanni, Wasiu; Aken, Bronwen; Aldridge, Sarah; Chakrapani, Varshith; Chow, William; Clarke, Laura; Cummins, Carla; Doran, Anthony; Dunn, Matthew; Goodstadt, Leo; Howe, Kerstin; Howell, Matthew; Josselin, Ambre-Aurore; Karn, Robert C; Laukaitis, Christina M; Jingtao, Lilue; Martin, Fergal; Muffato, Matthieu; Nachtweide, Stefanie; Quail, Michael A; Sisu, Cristina; Stanke, Mario; Stefflova, Klara; Van Oosterhout, Cock; Veyrunes, Frederic; Ward, Ben; Yang, Fengtang; Yazdanifar, Golbahar; Zadissa, Amonida; Adams, David J; Brazma, Alvis; Gerstein, Mark; Paten, Benedict; Pham, Son; Keane, Thomas M; Odom, Duncan T; Flicek, Paul.

Genome Res ; 28(4): 448-459, 2018 04.

Article in English | MEDLINE | ID: mdl-29563166

ABSTRACT

Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology.

Subject(s)

Evolution, Molecular , Genome/genetics , Muridae/genetics , Phylogeny , Animals , Binding Sites , CCCTC-Binding Factor/genetics , Chromosomes/genetics , Karyotyping/methods , Long Interspersed Nucleotide Elements/genetics , Mice , Retroelements/genetics , Species Specificity

8.

Ensembl 2018.

Zerbino, Daniel R; Achuthan, Premanand; Akanni, Wasiu; Amode, M Ridwan; Barrell, Daniel; Bhai, Jyothish; Billis, Konstantinos; Cummins, Carla; Gall, Astrid; Girón, Carlos García; Gil, Laurent; Gordon, Leo; Haggerty, Leanne; Haskell, Erin; Hourlier, Thibaut; Izuogu, Osagie G; Janacek, Sophie H; Juettemann, Thomas; To, Jimmy Kiang; Laird, Matthew R; Lavidas, Ilias; Liu, Zhicheng; Loveland, Jane E; Maurel, Thomas; McLaren, William; Moore, Benjamin; Mudge, Jonathan; Murphy, Daniel N; Newman, Victoria; Nuhn, Michael; Ogeh, Denye; Ong, Chuang Kee; Parker, Anne; Patricio, Mateus; Riat, Harpreet Singh; Schuilenburg, Helen; Sheppard, Dan; Sparrow, Helen; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Walts, Brandon; Zadissa, Amonida; Frankish, Adam; Hunt, Sarah E; Kostadima, Myrto; Langridge, Nicholas; Martin, Fergal J; Muffato, Matthieu; Perry, Emily.

Nucleic Acids Res ; 46(D1): D754-D761, 2018 01 04.

Article in English | MEDLINE | ID: mdl-29155950

ABSTRACT

The Ensembl project has been aggregating, processing, integrating and redistributing genomic datasets since the initial releases of the draft human genome, with the aim of accelerating genomics research through rapid open distribution of public data. Large amounts of raw data are thus transformed into knowledge, which is made available via a multitude of channels, in particular our browser (http://www.ensembl.org). Over time, we have expanded in multiple directions. First, our resources describe multiple fields of genomics, in particular gene annotation, comparative genomics, genetics and epigenomics. Second, we cover a growing number of genome assemblies; Ensembl Release 90 contains exactly 100. Third, our databases feed simultaneously into an array of services designed around different use cases, ranging from quick browsing to genome-wide bioinformatic analysis. We present here the latest developments of the Ensembl project, with a focus on managing an increasing number of assemblies, supporting efforts in genome interpretation and improving our browser.

Subject(s)

Databases, Genetic , Datasets as Topic , Genome , Information Dissemination , Animals , Epigenomics , Genome, Human , Genome-Wide Association Study , Genomics , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Annotation , Vertebrates/genetics , Web Browser

9.

Ensembl 2017.

Aken, Bronwen L; Achuthan, Premanand; Akanni, Wasiu; Amode, M Ridwan; Bernsdorff, Friederike; Bhai, Jyothish; Billis, Konstantinos; Carvalho-Silva, Denise; Cummins, Carla; Clapham, Peter; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah E; Janacek, Sophie H; Juettemann, Thomas; Keenan, Stephen; Laird, Matthew R; Lavidas, Ilias; Maurel, Thomas; McLaren, William; Moore, Benjamin; Murphy, Daniel N; Nag, Rishi; Newman, Victoria; Nuhn, Michael; Ong, Chuang Kee; Parker, Anne; Patricio, Mateus; Riat, Harpreet Singh; Sheppard, Daniel; Sparrow, Helen; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Walts, Brandon; Wilder, Steven P; Zadissa, Amonida; Kostadima, Myrto; Martin, Fergal J; Muffato, Matthieu; Perry, Emily; Ruffier, Magali; Staines, Daniel M; Trevanion, Stephen J; Cunningham, Fiona; Yates, Andrew; Zerbino, Daniel R; Flicek, Paul.

Nucleic Acids Res ; 45(D1): D635-D642, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27899575

ABSTRACT

Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access methods ensure uniform data analysis and distribution for all supported species. Together, these provide a comprehensive solution for large-scale and targeted genomics applications alike. Among many other developments over the past year, we have improved our resources for gene regulation and comparative genomics, and added CRISPR/Cas9 target sites. We released new browser functionality and tools, including improved filtering and prioritization of genome variation, Manhattan plot visualization for linkage disequilibrium and eQTL data, and an ontology search for phenotypes, traits and disease. We have also enhanced data discovery and access with a track hub registry and a selection of new REST end points. All Ensembl data are freely released to the scientific community and our source code is available via the open source Apache 2.0 license.

Subject(s)

Computational Biology/methods , Databases, Genetic , Genomics/methods , Search Engine , Software , Web Browser , Animals , Data Mining , Evolution, Molecular , Gene Expression Regulation , Genetic Variation , Genome, Human , Humans , Molecular Sequence Annotation , Species Specificity , Vertebrates

10.

Ensembl 2016.

Yates, Andrew; Akanni, Wasiu; Amode, M Ridwan; Barrell, Daniel; Billis, Konstantinos; Carvalho-Silva, Denise; Cummins, Carla; Clapham, Peter; Fitzgerald, Stephen; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah E; Janacek, Sophie H; Johnson, Nathan; Juettemann, Thomas; Keenan, Stephen; Lavidas, Ilias; Martin, Fergal J; Maurel, Thomas; McLaren, William; Murphy, Daniel N; Nag, Rishi; Nuhn, Michael; Parker, Anne; Patricio, Mateus; Pignatelli, Miguel; Rahtz, Matthew; Riat, Harpreet Singh; Sheppard, Daniel; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Wilder, Steven P; Zadissa, Amonida; Birney, Ewan; Harrow, Jennifer; Muffato, Matthieu; Perry, Emily; Ruffier, Magali; Spudich, Giulietta; Trevanion, Stephen J; Cunningham, Fiona; Aken, Bronwen L; Zerbino, Daniel R; Flicek, Paul.

Nucleic Acids Res ; 44(D1): D710-6, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26687719

ABSTRACT

The Ensembl project (http://www.ensembl.org) is a system for genome annotation, analysis, storage and dissemination designed to facilitate the access of genomic annotation from chordates and key model organisms. It provides access to data from 87 species across our main and early access Pre! websites. This year we introduced three newly annotated species and released numerous updates across our supported species with a concentration on data for the latest genome assemblies of human, mouse, zebrafish and rat. We also provided two data updates for the previous human assembly, GRCh37, through a dedicated website (http://grch37.ensembl.org). Our tools, in particular the VEP, have been improved significantly through integration of additional third party data. REST is now capable of larger-scale analysis and our regulatory data BioMart can deliver faster results. The website is now capable of displaying long-range interactions such as those found in cis-regulated datasets. Finally we have launched a website optimized for mobile devices providing views of genes, variants and phenotypes. Our data is made available without restriction and all code is available from our GitHub organization site (http://github.com/Ensembl) under an Apache 2.0 license.

Subject(s)

Databases, Genetic , Genomics , Molecular Sequence Annotation , Animals , Genes , Genetic Variation , Humans , Internet , Mice , Proteins/genetics , Rats , Regulatory Sequences, Nucleic Acid , Software

11.

Implementing and testing Bayesian and maximum-likelihood supertree methods in phylogenetics.

Akanni, Wasiu A; Wilkinson, Mark; Creevey, Christopher J; Foster, Peter G; Pisani, Davide.

R Soc Open Sci ; 2(8): 140436, 2015 Aug.

Article in English | MEDLINE | ID: mdl-26361544

ABSTRACT

Since their advent, supertrees have been increasingly used in large-scale evolutionary studies requiring a phylogenetic framework and substantial efforts have been devoted to developing a wide variety of supertree methods (SMs). Recent advances in supertree theory have allowed the implementation of maximum likelihood (ML) and Bayesian SMs, based on using an exponential distribution to model incongruence between input trees and the supertree. Such approaches are expected to have advantages over commonly used non-parametric SMs, e.g. matrix representation with parsimony (MRP). We investigated new implementations of ML and Bayesian SMs and compared these with some currently available alternative approaches. Comparisons include hypothetical examples previously used to investigate biases of SMs with respect to input tree shape and size, and empirical studies based either on trees harvested from the literature or on trees inferred from phylogenomic scale data. Our results provide no evidence of size or shape biases and demonstrate that the Bayesian method is a viable alternative to MRP and other non-parametric methods. Computation of input tree likelihoods allows the adoption of standard tests of tree topologies (e.g. the approximately unbiased test). The Bayesian approach is particularly useful in providing support values for supertree clades in the form of posterior probabilities.
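The exponential model mentioned in this abstract can be illustrated with a small sketch. It assumes that the probability of an input tree given a supertree is proportional to exp(-beta * d), where d is a tree-to-tree distance, so that, ignoring normalising constants, the log-likelihood of a supertree is -beta times the summed distances to the input trees. The choice of distance here (symmetric difference of clades on rooted trees, after pruning the supertree to each input tree's taxa), the nested-tuple tree encoding, and all function names are illustrative assumptions, not the authors' implementation.

```python
def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clusters(tree, keep):
    """Non-trivial clades of `tree` after restricting it to the taxa in `keep`."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node]) & keep
        clade = frozenset().union(*(visit(child) for child in node))
        # Skip single-taxon clades and the clade containing every kept taxon.
        if 1 < len(clade) < len(keep):
            found.add(clade)
        return clade

    visit(tree)
    return found


def distance(supertree, input_tree):
    """Symmetric-difference distance between the pruned supertree and an input tree."""
    keep = leaves(input_tree) & leaves(supertree)
    return len(clusters(supertree, keep) ^ clusters(input_tree, keep))


def log_likelihood(supertree, input_trees, beta=1.0):
    """Unnormalised log-likelihood: -beta times the summed distances (beta is assumed)."""
    return -beta * sum(distance(supertree, t) for t in input_trees)


# Hypothetical input trees and two candidate supertrees.
inputs = [(("A", "B"), "C"), (("A", "D"), "B")]
candidate_1 = ((("A", "B"), "C"), ("D", "E"))
candidate_2 = ((("A", "D"), "B"), ("C", "E"))
best = max([candidate_1, candidate_2], key=lambda s: log_likelihood(s, inputs))
```

Under this sketch a supertree that conflicts with fewer input-tree clades receives a higher score, which is the quantity a heuristic search would try to maximise.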

12.

Horizontal gene flow from Eubacteria to Archaebacteria and what it means for our understanding of eukaryogenesis.

Akanni, Wasiu A; Siu-Ting, Karen; Creevey, Christopher J; McInerney, James O; Wilkinson, Mark; Foster, Peter G; Pisani, Davide.

Philos Trans R Soc Lond B Biol Sci ; 370(1678): 20140337, 2015 09 26.

Article in English | MEDLINE | ID: mdl-26323767

ABSTRACT

The origin of the eukaryotic cell is considered one of the major evolutionary transitions in the history of life. Current evidence strongly supports a scenario of eukaryotic origin in which two prokaryotes, an archaebacterial host and an α-proteobacterium (the free-living ancestor of the mitochondrion), entered a stable symbiotic relationship. The establishment of this relationship was associated with a process of chimerization, whereby a large number of genes from the α-proteobacterial symbiont were transferred to the host nucleus. A general framework allowing the conceptualization of eukaryogenesis from a genomic perspective has long been lacking. Recent studies suggest that the origins of several archaebacterial phyla were coincident with massive imports of eubacterial genes. Although this does not indicate that these phyla originated through the same process that led to the origin of Eukaryota, it suggests that Archaebacteria might have had a general propensity to integrate into their genomes large amounts of eubacterial DNA. We suggest that this propensity provides a framework in which eukaryogenesis can be understood and studied in the light of archaebacterial ecology. We applied a recently developed supertree method to a genomic dataset composed of 392 eubacterial and 51 archaebacterial genera to test whether large numbers of genes flowing from Eubacteria are indeed coincident with the origin of major archaebacterial clades. In addition, we identified two potential large-scale transfers of uncertain directionality at the base of the archaebacterial tree. Our results are consistent with previous findings and seem to indicate that eubacterial gene imports (particularly from δ-Proteobacteria, Clostridia and Actinobacteria) were an important factor in archaebacterial history. Archaebacteria seem to have long relied on Eubacteria as a source of genetic diversity, and while the precise mechanism that allowed these imports is unknown, we suggest that our results support the view that processes comparable to those through which eukaryotes emerged might have been common in archaebacterial history.

Subject(s)

Bacteria/genetics , Biological Evolution , Gene Flow , Genome, Bacterial , Models, Genetic

13.

L.U.St: a tool for approximated maximum likelihood supertree reconstruction.

Akanni, Wasiu A; Creevey, Christopher J; Wilkinson, Mark; Pisani, Davide.

BMC Bioinformatics ; 15: 183, 2014 Jun 12.

Article in English | MEDLINE | ID: mdl-24925766

ABSTRACT

BACKGROUND: Supertrees combine disparate, partially overlapping trees to generate a synthesis that provides a high level perspective that cannot be attained from the inspection of individual phylogenies. Supertrees can be seen as meta-analytical tools that can be used to make inferences based on results of previous scientific studies. Their meta-analytical application has increased in popularity since it was realised that the power of statistical tests for the study of evolutionary trends critically depends on the use of taxon-dense phylogenies. Further to that, supertrees have found applications in phylogenomics where they are used to combine gene trees and recover species phylogenies based on genome-scale data sets. RESULTS: Here, we present the L.U.St package, a Python tool for approximate maximum likelihood supertree inference and illustrate its application using a genomic data set for the placental mammals. L.U.St allows the calculation of the approximate likelihood of a supertree, given a set of input trees, performs heuristic searches to look for the supertree of highest likelihood, and performs statistical tests of two or more supertrees. To this end, L.U.St implements a winning sites test allowing ranking of a collection of a-priori selected hypotheses, given as a collection of input supertree topologies. It also outputs a file of input-tree-wise likelihood scores that can be used as input to CONSEL for calculation of standard tests of two trees (e.g. Kishino-Hasegawa, Shimodaira-Hasegawa and Approximately Unbiased tests). CONCLUSION: This is the first fully parametric implementation of a supertree method; it has clearly understood properties and provides several advantages over currently available supertree approaches. It is easy to implement and works on any platform that has Python installed. AVAILABILITY: bitBucket page - https://afro-juju@bitbucket.org/afro-juju/l.u.st.git. CONTACT: Davide.Pisani@bristol.ac.uk.

Subject(s)

Likelihood Functions , Algorithms , Animals , Genomics , Humans
